Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
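Fairness-aware learning has mainly focused on single-task learning (STL). The fairness implications of multi-task learning (MTL) have only recently been considered, and a pioneering approach has been proposed that accounts for the fairness-accuracy trade-off of each task as well as the performance trade-off among tasks. Instead of a rigid fairness-accuracy trade-off formulation, we propose a flexible approach that selects which objective (accuracy or fairness) to optimize at each step. We introduce the L2T-FMT algorithm, a collaboratively trained teacher-student network: the student learns to solve the fair MTL problem, while the teacher instructs the student to learn from either accuracy or fairness, depending on which is harder to learn for each task. Moreover, this dynamic selection of the objective at each step for each task reduces the number of trade-off weights from 2T to T, where T is the number of tasks. Our experiments on three real datasets show that L2T-FMT improves on state-of-the-art approaches in both fairness (12-19%) and accuracy (up to 2%).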
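The per-step objective selection and the reduction from 2T to T weights can be illustrated with a minimal sketch. It does not reproduce the L2T-FMT teacher, which is a trained network; as a stand-in assumption, the "teacher" here is a fixed rule that picks whichever loss is currently larger for each task. The function names and the example loss values are hypothetical.

```python
def select_objectives(acc_losses, fair_losses):
    """Pick one objective per task: whichever loss is currently larger.

    Assumption: 'larger loss' stands in for 'harder to learn'; the
    original method uses a trained teacher network instead of this rule.
    """
    return [
        "fairness" if fair > acc else "accuracy"
        for acc, fair in zip(acc_losses, fair_losses)
    ]


def combined_loss(acc_losses, fair_losses, task_weights):
    """Combine the selected per-task losses using one weight per task.

    Because only one objective is active per task at each step, T weights
    suffice instead of separate accuracy and fairness weights (2T).
    """
    choices = select_objectives(acc_losses, fair_losses)
    total = 0.0
    for acc, fair, weight, choice in zip(acc_losses, fair_losses, task_weights, choices):
        total += weight * (fair if choice == "fairness" else acc)
    return total, choices


# Two tasks: accuracy is harder for the first, fairness for the second.
total, choices = combined_loss([0.9, 0.2], [0.3, 0.6], [0.5, 0.5])
```

In a training loop this selection would be recomputed at every step, so a task can alternate between objectives as its losses change.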
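Data-driven AI systems can lead to discrimination on the basis of protected attributes such as gender or race. One reason for this behavior is the societal bias encoded in the training data (e.g., women are underrepresented), which is aggravated in the presence of imbalanced class distributions (e.g., "granted" is the minority class). State-of-the-art fairness-aware machine learning approaches focus on preserving the overall classification accuracy while improving fairness. In the presence of class imbalance, such methods may further aggravate discrimination by denying an already underrepresented group (e.g., women) the fundamental right to equal social privileges (e.g., equal credit opportunity). To this end, we propose AdaFair, a fairness-aware boosting ensemble that changes the data distribution at each round, taking into account not only the class errors but also the fairness-related performance of the model, computed cumulatively over the partial ensemble. In addition to the in-training boosting of the group that is discriminated against in each round, AdaFair tackles imbalance directly in the post-training phase by optimizing the number of ensemble learners for balanced error rate (BER). AdaFair can accommodate different parity-based fairness notions and effectively mitigates discriminatory outcomes. Our experiments show that our approach can achieve parity in terms of statistical parity and equal opportunity while maintaining good predictive performance for all classes.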
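To make the two quantities named above concrete, the following sketch computes the balanced error rate (the mean of the per-class error rates) and a statistical parity difference on toy labels. It only illustrates the standard definitions and is not AdaFair's code; the function names and the example data are assumptions.

```python
def balanced_error_rate(y_true, y_pred):
    """Mean of the per-class error rates, so each class counts equally."""
    classes = sorted(set(y_true))
    per_class_error = []
    for c in classes:
        idx = [i for i, y in enumerate(y_true) if y == c]
        wrong = sum(1 for i in idx if y_pred[i] != c)
        per_class_error.append(wrong / len(idx))
    return sum(per_class_error) / len(classes)


def statistical_parity_difference(y_pred, protected, positive=1):
    """P(pred = positive | non-protected) - P(pred = positive | protected)."""
    prot = [p for p, g in zip(y_pred, protected) if g]
    non_prot = [p for p, g in zip(y_pred, protected) if not g]
    rate_prot = sum(1 for p in prot if p == positive) / len(prot)
    rate_non_prot = sum(1 for p in non_prot if p == positive) / len(non_prot)
    return rate_non_prot - rate_prot


# A classifier that always predicts the majority class: 75% accurate,
# yet it misclassifies every minority-class instance.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0]
ber = balanced_error_rate(y_true, y_pred)
```

The example shows why overall accuracy can hide poor minority-class performance: accuracy is 0.75 while the balanced error rate is 0.5.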
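As decision-making increasingly relies on machine learning and (big) data, the fairness of data-driven AI systems is receiving growing attention from both research and industry. A wide variety of fairness-aware machine learning solutions have been proposed, involving fairness-related interventions in the data, the learning algorithms, and/or the model outputs. However, a vital part of proposing new approaches is validating them empirically on benchmark datasets that represent realistic and diverse settings. Therefore, in this paper, we give an overview of real-world datasets used for fairness-aware machine learning. We focus on tabular data, the most common data representation in fairness-aware machine learning. We begin our analysis by using a Bayesian network to identify relationships among the attributes, particularly with respect to the protected attributes and the class attribute. To gain a deeper understanding of bias and fairness in the datasets, we investigate the interesting relationships using exploratory analysis.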
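Recent studies have shown that datasets used in fairness-aware machine learning with multiple protected attributes (hereafter referred to as multi-discrimination) are often imbalanced. The class-imbalance problem is more severe for the protected groups (e.g., women, non-white people) that are typically underrepresented in the critical minority class. Nevertheless, existing methods focus only on the overall error-discrimination trade-off and ignore the imbalance problem, thereby amplifying the bias prevalent in the minority classes. Solutions are therefore needed for the combined problem of multi-discrimination and class imbalance. To this end, we introduce a new fairness measure, Multi-Max Mistreatment (MMM), which considers both the (multi-attribute) protected group and the class membership of instances when measuring discrimination. To solve the combined problem, we propose a boosting approach that incorporates MMM costs into the distribution update and, after training, selects the best trade-off among accurate, balanced, and fair solutions. Experimental results show the superiority of our approach over state-of-the-art methods, both in achieving the best balanced performance across groups and classes and in achieving the best accuracy for the protected groups in the minority class.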
Unsupervised learning-based anomaly detection in latent space has gained importance because discriminating anomalies from normal data becomes difficult in high-dimensional space. Both density estimation and distance-based methods for detecting anomalies in latent space have been explored in the past. These methods show that retaining valuable properties of the input data in latent space helps in the better reconstruction of test data. Moreover, real-world sensor data is skewed and non-Gaussian in nature, which makes mean-based estimators unreliable. Furthermore, anomaly detection methods based on reconstruction error rely on Euclidean distance, which does not consider useful correlation information in the feature space and also fails to accurately reconstruct the data when it deviates from the training distribution. In this work, we address the limitations of reconstruction error-based autoencoders and propose a kernelized autoencoder that leverages a robust form of Mahalanobis distance (MD) to measure latent dimension correlation and effectively detect both near and far anomalies. This hybrid loss is aided by the principle of maximizing the mutual information gain between the latent dimension and the high-dimensional prior data space by maximizing the entropy of the latent space while preserving useful correlation information of the original data in the low-dimensional latent space. The multi-objective function has two goals: it measures correlation information in the latent feature space in the form of robust MD, and it simultaneously tries to preserve useful correlation information from the original data space in the latent space by maximizing mutual information between the prior and latent space.
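The contrast between Euclidean and Mahalanobis distance can be seen in a small sketch. The abstract does not specify the paper's robust MD estimator, so this example makes two assumptions for illustration: the latent space is 2-D (so the covariance can be inverted in closed form), and robustness is approximated by centering on the per-dimension median instead of the mean. All names and data are hypothetical.

```python
import math
from statistics import median


def fit_latent_stats(latents):
    """Estimate a median center and the covariance around it (2-D codes).

    Assumption: the median replaces the mean as a skew-tolerant location;
    the paper's actual robust estimator is not given in the abstract.
    """
    cx = median(p[0] for p in latents)
    cy = median(p[1] for p in latents)
    n = len(latents)
    sxx = sum((p[0] - cx) ** 2 for p in latents) / (n - 1)
    syy = sum((p[1] - cy) ** 2 for p in latents) / (n - 1)
    sxy = sum((p[0] - cx) * (p[1] - cy) for p in latents) / (n - 1)
    return (cx, cy), (sxx, sxy, syy)


def mahalanobis(point, center, cov):
    """sqrt(d^T S^-1 d), using the closed-form inverse of a 2x2 covariance."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - center[0], point[1] - center[1]
    quad = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(quad)


# Latent codes with strongly correlated dimensions.
codes = [(0, 0), (1, 1), (2, 2.2), (3, 2.9), (4, 4.1), (-1, -0.8), (-2, -2.1)]
center, cov = fit_latent_stats(codes)

# Both points are equally far from the center in Euclidean terms, but only
# the second one violates the correlation structure of the training codes.
along = mahalanobis((3, 3), center, cov)
against = mahalanobis((3, -1), center, cov)
```

Here `against` is larger than `along`, which is the property a correlation-aware distance contributes that a Euclidean reconstruction error does not.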
The use of technologically advanced devices has boomed in many domains, including education, automation, and healthcare, with most services requiring Internet connectivity. Device identification plays a key role in securing a network. In this paper, we propose a device fingerprinting (DFP) model that is able to distinguish between Internet of Things (IoT) and non-IoT devices, as well as uniquely identify individual devices. Four statistical features are extracted from five consecutive device-originated packets to generate individual device fingerprints. The method has been evaluated using the Random Forest (RF) classifier and different datasets. Experimental results show that the proposed method achieves up to 99.8% accuracy in distinguishing between IoT and non-IoT devices and over 97.6% in classifying individual devices. These results indicate that the proposed method can help operators make their networks more secure and more robust to security breaches and unauthorized access.
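The windowed feature extraction can be sketched as follows. The abstract does not name the four statistical features or say whether windows overlap, so this example assumes, purely for illustration, the mean, population standard deviation, minimum, and maximum of packet lengths over non-overlapping windows of five packets. The function names and packet sizes are hypothetical; the resulting vectors are what would be passed to a classifier such as Random Forest.

```python
from statistics import mean, pstdev

WINDOW = 5  # five consecutive device-originated packets, as in the abstract


def fingerprint(packet_lengths):
    """Four summary statistics for one window of packet lengths.

    Assumption: these particular statistics are placeholders; the source
    does not state which four features the method uses.
    """
    return (
        mean(packet_lengths),
        pstdev(packet_lengths),
        min(packet_lengths),
        max(packet_lengths),
    )


def fingerprints(packet_lengths, window=WINDOW):
    """Split a device's packet stream into full windows and summarize each."""
    usable = len(packet_lengths) - len(packet_lengths) % window
    return [
        fingerprint(packet_lengths[start:start + window])
        for start in range(0, usable, window)
    ]


# Twelve packets from one device yield two fingerprints; the trailing
# two packets do not fill a window and are ignored.
stream = [60, 60, 60, 60, 60, 100, 200, 300, 400, 500, 42, 42]
features = fingerprints(stream)
```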
Multiple studies have focused on predicting the prospective popularity of an online document as a whole, without paying attention to the contributions of its individual parts. We introduce the task of proactively forecasting popularities of sentences within online news documents solely utilizing their natural language content. We model sentence-specific popularity forecasting as a sequence regression task. For training our models, we curate InfoPop, the first dataset containing popularity labels for over 1.7 million sentences from over 50,000 online news documents. To the best of our knowledge, this is the first dataset automatically created using streams of incoming search engine queries to generate sentence-level popularity annotations. We propose a novel transfer learning approach involving sentence salience prediction as an auxiliary task. Our proposed technique coupled with a BERT-based neural model exceeds nDCG values of 0.8 for proactive sentence-specific popularity forecasting. Notably, our study presents a non-trivial takeaway: though popularity and salience are different concepts, transfer learning from salience prediction enhances popularity forecasting. We release InfoPop and make our code publicly available: https://github.com/sayarghoshroy/InfoPopularity
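For readers unfamiliar with the reported metric, the sketch below computes nDCG for one document: sentences are ranked by predicted popularity, and the discounted cumulative gain of that ranking is divided by the gain of the ideal ranking. It uses the linear-gain form of DCG with a log2 discount, which is an assumption, since the abstract does not say which variant was used. The function names and scores are hypothetical.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain: the item at rank r (1-based) is divided by log2(r + 1)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(true_scores, predicted_scores):
    """DCG of the predicted ordering, normalized by the best achievable DCG."""
    order = sorted(
        range(len(true_scores)),
        key=lambda i: predicted_scores[i],
        reverse=True,
    )
    ranked = [true_scores[i] for i in order]
    ideal_dcg = dcg(sorted(true_scores, reverse=True))
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


# True popularity labels for three sentences and two sets of forecasts.
labels = [3, 2, 1]
perfect = ndcg(labels, [0.9, 0.5, 0.1])   # same ordering as the labels
reversed_ = ndcg(labels, [0.1, 0.5, 0.9])  # worst possible ordering
```

A forecast that orders the sentences exactly as the labels do scores 1.0, and any other ordering of distinct labels scores lower.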
Almost 80 million Americans suffer from hair loss due to aging, stress, medication, or genetic makeup. Hair and scalp-related diseases often go unnoticed in the beginning. Sometimes, a patient cannot differentiate between hair loss and regular hair fall. Diagnosing hair-related diseases is time-consuming because it requires professional dermatologists to perform visual and medical tests. As a result, the overall diagnosis is delayed, which worsens the severity of the illness. Owing to their image-processing ability, neural network-based applications are used in various sectors, especially healthcare and health informatics, to predict deadly diseases such as cancers and tumors. These applications assist clinicians and patients and provide initial insight into early-stage symptoms. In this study, we used a deep learning approach that successfully predicts three main types of hair loss and scalp-related diseases: alopecia, psoriasis, and folliculitis. However, the limited number of studies in this area, the unavailability of a proper dataset, and the degree of variety among the images scattered over the internet made the task challenging. We obtained 150 images from various sources and preprocessed them by denoising, image equalization, enhancement, and data balancing, thereby minimizing the error rate. After feeding the processed data into the 2D convolutional neural network (CNN) model, we obtained an overall training accuracy of 96.2% and a validation accuracy of 91.1%. The precision and recall scores for alopecia, psoriasis, and folliculitis are 0.895, 0.846, and 1.0, respectively. We also created a dataset of scalp images for future researchers.
Agile robotics presents a difficult challenge with robots moving at high speeds requiring precise and low-latency sensing and control. Creating agile motion that accomplishes the task at hand while being safe to execute is a key requirement for agile robots to gain human trust. This requires designing new approaches that are flexible and maintain knowledge over world constraints. In this paper, we consider the problem of building a flexible and adaptive controller for a challenging agile mobile manipulation task of hitting ground strokes on a wheelchair tennis robot. We propose and evaluate an extension to work done on learning striking behaviors using a probabilistic movement primitive (ProMP) framework by (1) demonstrating the safe execution of learned primitives on an agile mobile manipulator setup, and (2) proposing an online primitive refinement procedure that utilizes evaluative feedback from humans on the executed trajectories.